Detecting Gene Relations from MEDLINE Abstracts

Authors

  • Matthew J. Stephens
  • Mathew J. Palakal
  • Snehasis Mukhopadhyay
  • Rajeev R. Raje
  • Javed Mostafa
Abstract

Research in bioinformatics in the past decade has generated a large volume of textual biological data stored in databases such as MEDLINE. It takes a copious amount of effort and time, even for expert users, to manually extract useful information embedded in such a large volume of retrieved data, and automated intelligent text analysis tools are increasingly becoming essential. In this article, we present a simple analysis and knowledge discovery method that can identify related genes as well as their shared functionality (if any) based on a collection of relevant retrieved MEDLINE documents. The relative computational simplicity of the proposed method makes it possible to process and analyze large volumes of data in a short time. Hence, it significantly contributes to and enhances a user's ability to discover such embedded information. Two case studies are presented that indicate the usefulness of the proposed method.


Similar resources

Validating Candidate Gene-Mutation Relations in MEDLINE Abstracts via Crowdsourcing

We describe an experiment to elicit judgments on the validity of gene-mutation relations in MEDLINE abstracts via crowdsourcing. The biomedical literature contains rich information on such relations, but the correct pairings are difficult to extract automatically because a single abstract may mention multiple genes and mutations. We ran an experiment presenting candidate gene-mutation relations...


Genescene: An Ontology-enhanced Integration of Linguistic and Co-occurrence based Relations in Biomedical Texts

The increasing amount of publicly available literature and experimental data in biomedicine makes it hard for biomedical researchers to stay up-to-date. Genescene is a toolkit that will help alleviate this problem by providing an overview of published literature content. We combined a linguistic parser with Concept Space, a co-occurrence based semantic net. Both techniques extract complementary ...


A mutation-centric approach to identifying pharmacogenomic relations in text

OBJECTIVES: To explore the notion of mutation-centric pharmacogenomic relation extraction and to evaluate our approach against reference pharmacogenomic relations. METHODS: From a corpus of MEDLINE abstracts relevant to genetic variation, we identify co-occurrences between drug mentions extracted using MetaMap and RxNorm, and genetic variants extracted by EMU. The recall of our approach is eval...


Automatic Recognition of Topic-Classified Relations between Prostate Cancer and Genes from Medline Abstracts

To recognize instances of medical information concerning prostate cancer and its relevant genes, we developed a machine learning-based relation recognizer using rich contextual features. We collected prostate cancer-related abstracts from Medline. We then constructed an annotated corpus of prostate cancer and gene relations, which consisted of six topic-classified categories, with more detail...


miRTex: A Text Mining System for miRNA-Gene Relation Extraction

MicroRNAs (miRNAs) regulate a wide range of cellular and developmental processes through gene expression suppression or mRNA degradation. Experimentally validated miRNA gene targets are often reported in the literature. In this paper, we describe miRTex, a text mining system that extracts miRNA-target relations, as well as miRNA-gene and gene-miRNA regulation relations. The system achieves good...


Ranking Gene-Drug Relationships in Biomedical Literature Using Latent Dirichlet Allocation

Drug responses vary greatly among individuals due to human genetic variations, which is known as pharmacogenomics (PGx). Much of the PGx knowledge has been embedded in biomedical literature and there is a growing interest to develop text mining approaches to extract such knowledge. In this paper, we present a study to rank candidate gene-drug relations using Latent Dirichlet Allocation (LDA) mo...



Journal title:
  • Pacific Symposium on Biocomputing


Publication date: 2001